Configural frequency analysis (CFA) is a method of exploratory data analysis introduced by Gustav A. Lienert in 1969. The goal of a configural frequency analysis is to detect patterns in the data that occur significantly more often (''types'') or significantly less often (''antitypes'') than expected by chance. The identified types and antitypes are meant to provide some insight into the structure of the data. Types are interpreted as concepts that are constituted by a pattern of variable values. Antitypes are interpreted as patterns of variable values that generally do not occur together.

==Basic idea of the CFA algorithm==

We explain the basic idea of CFA with a simple example. Assume that we have a data set that records, for each of ''n'' patients, whether they show certain symptoms ''s''<sub>1</sub>, ..., ''s''<sub>''m''</sub>. For simplicity we assume that a symptom is either shown or not, i.e. the data set is dichotomous.

Each record in the data set is thus an ''m''-tuple (''x''<sub>1</sub>, ..., ''x''<sub>''m''</sub>) where each ''x''<sub>''i''</sub> is either 0 (the patient does not show symptom ''i'') or 1 (the patient shows symptom ''i''). Each such ''m''-tuple is called a ''configuration''. Let ''C'' be the set of all possible configurations, i.e. the set {0, 1}<sup>''m''</sup> of all binary ''m''-tuples. The data set can thus be described by listing the observed frequencies ''f''(''c'') of all configurations ''c'' in ''C''.

The basic idea of CFA is to estimate the frequency of each configuration under the assumption that the ''m'' symptoms are statistically independent. Let ''e''(''c'') be this estimated frequency under the assumption of independence.

Let ''p''<sub>''i''</sub>(1) be the probability that a member of the investigated population shows symptom ''s''<sub>''i''</sub>, and ''p''<sub>''i''</sub>(0) the probability that a member of the investigated population does not show symptom ''s''<sub>''i''</sub>.
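The quantities defined so far can be tabulated directly from the records. The following is a minimal Python sketch, not a reference implementation. The function names <code>observed_frequencies</code> and <code>marginal_probabilities</code> and the sample records are hypothetical. It also assumes that the probabilities ''p''<sub>''i''</sub>(1) and ''p''<sub>''i''</sub>(0) are estimated by the relative frequencies of each symptom in the sample, which the text above does not prescribe.

```python
from collections import Counter
from itertools import product


def observed_frequencies(records, m):
    """Return f(c) for every configuration c in {0, 1}^m.

    Configurations that never occur in the data get a frequency of 0.
    """
    counts = Counter(tuple(r) for r in records)
    return {c: counts.get(c, 0) for c in product((0, 1), repeat=m)}


def marginal_probabilities(records, m):
    """Estimate p_i(0) and p_i(1) for each symptom i.

    Assumption: the population probabilities are estimated by the
    relative frequency of each symptom in the sample.
    """
    n = len(records)
    estimates = []
    for i in range(m):
        p_shown = sum(r[i] for r in records) / n
        estimates.append({0: 1 - p_shown, 1: p_shown})
    return estimates


# Hypothetical data: 8 patients, 2 symptoms.
records = [(1, 1), (1, 1), (1, 0), (0, 0), (0, 0), (0, 1), (1, 1), (0, 0)]
f = observed_frequencies(records, m=2)
p = marginal_probabilities(records, m=2)
```

Here <code>f</code> maps each of the 2<sup>''m''</sup> configurations to its observed frequency, and <code>p[i][x]</code> holds the estimate for symptom ''i'' taking the value ''x'' (with ''i'' counted from 0 in the code).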
Under the assumption that all symptoms are independent, the expected relative frequency of a configuration ''c'' = (''c''<sub>1</sub>, ..., ''c''<sub>''m''</sub>) is the product of the corresponding probabilities ''p''<sub>''i''</sub>(''c''<sub>''i''</sub>). Multiplying by the number of patients ''n'' gives the expected frequency:

: <math>e(c) = n \prod_{i=1}^{m} p_i(c_i)</math>

Now ''f''(''c'') and ''e''(''c'') can be compared by a statistical test (typical tests applied in CFA are Pearson's chi-squared test, the binomial test or the hypergeometric test of Lehmacher).

If the statistical test suggests, for a given significance level ''α'', that the difference between ''f''(''c'') and ''e''(''c'') is significant, then ''c'' is called a ''type'' if ''f''(''c'') > ''e''(''c'') and an ''antitype'' if ''f''(''c'') < ''e''(''c''). If there is no significant difference between ''f''(''c'') and ''e''(''c''), then ''c'' is neither a type nor an antitype. Thus each configuration ''c'' can in principle have one of three states: it can be a type, an antitype, or not classified.

Types and antitypes are defined symmetrically, but in practical applications researchers are mainly interested in detecting types. For example, clinical studies are typically interested in detecting symptom combinations that are indicators of a disease. These are by definition symptom combinations that occur more often than expected by chance, i.e. types.
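The classification step described above can be sketched in Python as follows. This is an illustration under stated assumptions rather than a definitive implementation. The function names <code>binom_pmf</code> and <code>cfa</code> and the sample data are hypothetical. The probabilities ''p''<sub>''i''</sub> are estimated from the sample marginals, and the test used is a one-sided exact binomial tail probability in the direction of the observed deviation. No adjustment for testing many configurations at once is made.

```python
from collections import Counter
from itertools import product
from math import comb, prod


def binom_pmf(k, n, p):
    """Probability of exactly k successes in n trials with success probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def cfa(records, m, alpha=0.05):
    """Classify every configuration in {0, 1}^m as type, antitype or not classified.

    Assumptions (not taken from the text): marginal probabilities are
    estimated from the sample, and significance is judged by a one-sided
    exact binomial tail probability without multiple-testing correction.
    """
    n = len(records)
    counts = Counter(tuple(r) for r in records)
    # Estimated probability that symptom i is shown.
    p_shown = [sum(r[i] for r in records) / n for i in range(m)]

    result = {}
    for c in product((0, 1), repeat=m):
        # Expected relative frequency under independence.
        p_c = prod(p_shown[i] if c[i] == 1 else 1 - p_shown[i] for i in range(m))
        f_c = counts.get(c, 0)  # observed frequency f(c)
        e_c = n * p_c           # expected frequency e(c)

        if f_c > e_c:
            # Upper tail: probability of observing f(c) or more.
            p_value = sum(binom_pmf(k, n, p_c) for k in range(f_c, n + 1))
            label = "type"
        elif f_c < e_c:
            # Lower tail: probability of observing f(c) or fewer.
            p_value = sum(binom_pmf(k, n, p_c) for k in range(0, f_c + 1))
            label = "antitype"
        else:
            p_value, label = 1.0, "not classified"

        if p_value >= alpha:
            label = "not classified"
        result[c] = (f_c, e_c, p_value, label)
    return result


# Hypothetical data: 40 patients, 2 symptoms that tend to occur together.
records = [(1, 1)] * 18 + [(0, 0)] * 18 + [(1, 0)] * 2 + [(0, 1)] * 2
result = cfa(records, m=2)
for c, (f_c, e_c, p_value, label) in result.items():
    print(c, f_c, e_c, label)
```

In this constructed data set each symptom is shown by half of the patients, so every configuration has an expected frequency of 10. The configurations (1, 1) and (0, 0) occur 18 times each and come out as types, while (1, 0) and (0, 1) occur twice each and come out as antitypes.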